ABSTRACT:

In an image retrieval system, a database with a 
large number of images is 

searched to find one or more images meeting the 
specification of a user. This 
specification is given in the form of a query 
image. The system determines the 

similarity between the query image and a particular 
image from the database by 

comparing the color histograms of the two images. 
The histograms are treated 

as statistical distributions and the similarity is 

determined on the basis of 

an information theoretic measure of the 

distributions. In a first embodiment, 

the similarity is determined using the Kullback 

informational divergence of the 

two histograms. In a second embodiment, the
similarity is based on the entropy
of the distribution of similarity coefficients of
the two histograms.

01/07/2004, EAST Version: 1.4.1

9 Claims, 5 Drawing figures 

Exemplary Claim Number: 1 

Number of Drawing Sheets: 4 



KWIC 



Brief Summary Text - BSTX (2) : 

The invention relates to an image retrieval
system which includes a database
with candidate images, an entry unit for entering
a query image, and a first
histogram unit for deriving a first query color
histogram from the query image.

Brief Summary Text - BSTX (3) : 

A second histogram unit derives a first 
candidate color histogram from a 
particular candidate image. Also a determining 
unit determines a first 

similarity between the particular candidate image 
and the query image on the 

basis of the first candidate color histogram and 
the first query color 

histogram, and a retrieval unit retrieves the
particular candidate image.

Brief Summary Text - BSTX (7) : 

An image retrieval system and a method as 
described above, are known from 

the article "Tools and Techniques for Color Image
Retrieval", John R. Smith and

Shih-Fu Chang, Proc. SPIE - Int. Soc. Opt. Eng.
(USA), Vol. 2670, pp.

426-437. The image retrieval system includes a
database with a large number of 

images. A user searching for a particular image
specifies a query image showing

how the retrieved image or images should look.
Then the system compares
the stored images with the query image and ranks
the stored images according to

their similarity with the query image. The ranking
results are presented to

the user who may retrieve one or more of the 

images. The comparison of the 

query image with a stored image to determine the 
similarity may be based on a

number of features derived from the respective 
images. The article describes 

the usage of a color histogram as such a comparison 
feature. When using the 

RGB (Red, Green and Blue) representation of an 
image, a color histogram is 

computed by quantizing the colors within the image 
and counting the number of 

pixels of each color. To determine the similarity, 
a number of techniques are 

described to compare the two color histograms of 

the respective images. The 

histogram Euclidean distance is a simple measure
calculated by comparing 

identical bins in respective histograms. No 
cross-wise comparison is made 

between different bins which represent perceptually 
similar colors. 

Furthermore, techniques for determining a histogram 
intersection and techniques 

for determining a histogram quadratic distance are 
described. As an 

alternative to the histogram techniques, a 
comparison technique based on color 
sets is described. In this technique the color of 
a pixel is compared with a 

predetermined threshold. If the color is below the 
threshold, the pixel does 

not become a member of the set and otherwise it 
does become a member. A 

disadvantage is that a large number of pixels, all 
below the threshold, will 

not contribute in the comparison in any way. 
Furthermore, there is no 

discrimination between values above the threshold. 
The prior art techniques 

for determining the similarity between the 

candidate image and the query image 

are complex to execute and/or are occasionally not
adequate.
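
The prior-art comparisons summarized above can be illustrated with a minimal Python sketch: RGB colors are quantized, pixels are counted per bin, and two histograms are compared bin by bin. The function names, the choice of 4 levels per channel, and the use of NumPy are assumptions made for this illustration and are not taken from the cited article.

```python
import numpy as np

def color_histogram(image, bins_per_channel=4):
    """Normalized color histogram of an H x W x 3 RGB image (values 0..255)."""
    pixels = np.asarray(image, dtype=np.uint8).reshape(-1, 3).astype(np.int64)
    # Quantize each channel into bins_per_channel levels (assumed value).
    q = pixels * bins_per_channel // 256
    # Combine the three quantized channels into a single bin index.
    index = (q[:, 0] * bins_per_channel + q[:, 1]) * bins_per_channel + q[:, 2]
    counts = np.bincount(index, minlength=bins_per_channel ** 3).astype(float)
    return counts / counts.sum()

def euclidean_distance(h1, h2):
    """Compares identical bins only; no cross-wise comparison between bins."""
    return float(np.sqrt(np.sum((h1 - h2) ** 2)))

def histogram_intersection(h1, h2):
    """Sum of bin-wise minima; equals 1.0 for identical normalized histograms."""
    return float(np.sum(np.minimum(h1, h2)))
```

Because only identical bins are compared, two perceptually close colors that fall into neighboring bins contribute nothing to the intersection, which is the limitation the text notes for the Euclidean distance.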



Brief Summary Text - BSTX (9) : 

It is an object of the invention to provide an 
image retrieval system of the 

kind set forth with an improved mechanism for 
determining the similarity 

between the candidate image and the query image. 
This object is achieved in an 

image retrieval system having the determining unit
arranged to determine the

first similarity on the basis of information
conveyed by the first candidate
color histogram in response to information
requested by the first query color

histogram. Determining the similarity between the
respective images using an 

information theoretic measure is superior to the 
known techniques. The image 

retrieval system according to the invention is 
better able to establish the 

similarity between the query image and the images 
in the database. So, the 

image retrieval system according to the invention 
is superior in finding 

similar images and in avoiding images that are not 
similar enough. 

Furthermore, the calculation of the information 
theoretic measure requires less 

computational effort than the known techniques. 

Brief Summary Text - BSTX (10) : 

An embodiment of the image retrieval system 
according to the invention uses 

a Kullback informational divergence. The Kullback 
informational divergence is 

a measure for determining how different one 
statistical distribution is from 
another statistical distribution. The inventor has
realized that a color 

histogram can be treated as a statistical 
distribution and that the Kullback 
informational divergence can be applied for 
comparing the candidate color 
histogram with the query color histogram.
Experiments have shown that 

retrieval of images on the basis of a similarity 
obtained from applying the 

Kullback informational divergence on the respective 
color histograms gives very 

good results. Furthermore, the calculation of the
Kullback informational 

divergence requires less computational effort than 
the known techniques, which 

is very important since a large number of candidate 
images may need to be 
compared with the query image. 
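
A minimal Python sketch of treating two color histograms as distributions and computing a Kullback informational divergence between them follows. The direction of the divergence (query relative to candidate), the small smoothing constant `eps` that avoids division by zero for empty bins, and the function name are assumptions for this illustration; the excerpt does not reproduce the patent's own equation.

```python
import numpy as np

def kullback_divergence(query_hist, candidate_hist, eps=1e-10):
    """Divergence of the query distribution from the candidate distribution.

    Returns 0 for identical histograms and grows as they differ.
    """
    q = np.asarray(query_hist, dtype=float) + eps   # eps is an assumed smoothing term
    p = np.asarray(candidate_hist, dtype=float) + eps
    q /= q.sum()                                    # renormalize after smoothing
    p /= p.sum()
    return float(np.sum(q * np.log(q / p)))
```

The cost is one pass over the bins, so comparing a query against many stored histograms scales linearly with the number of candidates and the number of bins. Note that the measure is not symmetric: swapping the two arguments generally gives a different value.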

Brief Summary Text - BSTX (11) : 

Another embodiment of the image retrieval system 
also considers entropy of 

similarity coefficients. By determining the 
entropy of the distribution of the 
similarity coefficients, an indication of the 
flatness of this distribution is 
obtained. Since a particular similarity 
coefficient indicates the similarity 
between the candidate color histogram and the query 
color histogram for the 

particular bin, the obtained flatness is a measure 
for the similarity of the 

candidate color histogram and the query color 
histogram over all bins. 

Experiments have shown that retrieval of images on 

the basis of a similarity 

based on the entropy measure gives very good 
results. Furthermore, the 
calculation of the entropy requires less
computational effort than the known 
techniques, which is very important since a large 
number of candidate images 

may need to be compared with the query image. 
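
The flatness idea described above can be sketched in Python as follows. The excerpt does not give the formula for the similarity coefficients, so the per-bin ratio min/max used here is purely an assumed stand-in, as are the function names and the convention that a bin empty in both histograms counts as fully similar.

```python
import numpy as np

def similarity_coefficients(candidate_hist, query_hist):
    """Assumed per-bin coefficient: min(p_i, q_i) / max(p_i, q_i), in [0, 1]."""
    p = np.asarray(candidate_hist, dtype=float)
    q = np.asarray(query_hist, dtype=float)
    hi = np.maximum(p, q)
    lo = np.minimum(p, q)
    # Bins that are empty in both histograms are treated as fully similar (assumption).
    return np.divide(lo, hi, out=np.ones_like(hi), where=hi > 0)

def coefficient_entropy(candidate_hist, query_hist):
    """Entropy of the coefficients after normalizing them to sum to 1."""
    r = similarity_coefficients(candidate_hist, query_hist)
    total = r.sum()
    if total == 0:
        return 0.0
    d = r / total
    nonzero = d[d > 0]
    return float(-np.sum(nonzero * np.log(nonzero)))
```

With this stand-in, identical histograms give a uniform coefficient distribution and therefore the maximum entropy, log(number of bins); histograms that agree in only a few bins give a more peaked distribution and a lower entropy.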

Brief Summary Text - BSTX (12): 

A further embodiment of the image retrieval 

system according to the 

invention compares two color histograms of 

respective regions of the candidate 

image with two color histograms of corresponding 

regions of the query image, 

the spatial information in the respective images 
being employed when 

determining the similarity. This improves the 
accuracy of determining the 

similarity between the candidate image and the 
query image and a better 

discrimination among the images in the database can 
be achieved. 



Drawing Description Text - DRTX (4) : 

FIG. 2 schematically shows an image retrieval 

system according to the 

invention with multiple color histograms per image.

Detailed Description Text - DETX (2) : 

FIG. 1 schematically shows an image retrieval 

system according to the 

invention. The system 100 includes a database 102 
with a potentially large 

collection 104 of images. A purpose of such a 
system is to retrieve from the 

collection one or more images that match the wishes 
of a user of the system. 

Those wishes are specified via a query image 106, 
which the user can enter into

the system via entry means 108. The entry unit may 
allow the user to compose 

the query image from a number of existing images or 
to create the query image 

from scratch. The system compares the query image 
with the candidate images in 

the database and determines for each candidate 
image how similar it is to the 

query image. The system ranks the candidate images 
according to the 

established similarity. The system 100 compares 
images on the basis of their 

color histogram. To this end, the system includes
first histogram unit 110 to 

determine a query color histogram 112 from the 
query image 106. The process of 
determining a color histogram from an image is 
explained in FIG. 3 below. The 

system also includes second histogram unit 114 to 
determine a candidate color 

histogram 116 from a particular candidate image 
118. The first histogram unit 

and the second histogram unit may be integrated 
into one histogram means, which 

can act on the query image for generating the query 
color histogram and on the 

particular candidate image for generating the 
candidate color histogram 

respectively. The system further includes 
determining unit 120 to determine a 
similarity 122 on the basis of the query color 
histogram 112 and the candidate 

color histogram 116. Based on the similarity, the 
system presents a ranking of 

the candidate images on a display 126. The user may
select an image from this 

ranking which is retrieved from the database via 
retrieval means 124 for 

further processing. This further processing may 
include temporarily storing

the image in a file 128 for further selection.
This may be implemented such that

the system retrieves a number of candidate images 

and stores these in the file 

128, from where the user makes a final selection as 
to which image is desired. 

In such a way of working, the system makes a first 
selection from the large 

collection in the database 102 and the user selects 

the image or images from 

the much smaller collection in file 128. 



Detailed Description Text - DETX (3) : 

In a first embodiment of the image retrieval 

system according to the 

invention, the two color histograms between which a 
similarity must be 

determined are treated as two probability
distributions. The question as to how
similar the two histograms are can then be answered
by measuring how different

one probability distribution is from the other.
This difference between two 

statistical distributions is called informational 
divergence or Kullback 

informational divergence and is calculated with the 
following equation : 

##EQU1## In which: Q(x) is the normalized query 
color histogram, 



Detailed Description Text - DETX (11) : 

In a second embodiment of the image retrieval 

system according to the 

invention, similarity coefficients are determined 
for each pair of 

corresponding bins of the two color histograms 

between which a similarity must 
be determined. Subsequently the obtained
collection of similarity coefficients
is treated as a probability distribution and the
question as to how similar the

two histograms are is then answered by analyzing
this probability 

distribution. In this embodiment, the similarity 
coefficients are calculated 

using the following equation: ##EQU4## In which: 
r.sub.i (P,Q) is the 

similarity coefficient between bin i of the 
candidate color histogram and bin i 
of the query color histogram, 



Detailed Description Text - DETX (24) : 

In the embodiments of the image retrieval system 
described above, a single 

color histogram is made from the whole image. 

Because of this, the spatial 

information from the image is lost and the 

comparison of two images reflects 

only global similarity. For example if a user 

enters a query image with a sky 

at the top and sand at the bottom, the retrieved 
images are expected to have a 

mix of blue and beige, but not necessarily a sky
and sand. A desirable result 

for the retrieved candidate images would be images 
with blue at the top and 

beige at the bottom. In order to achieve this
result, a further embodiment of 

the system according to the invention determines a 
color histogram for a number 

of respective regions of the query image and 
compares these determined 

histograms with histograms of corresponding regions 
of the candidate image. 

The query image may be divided into regions using 
pre-fixed boundaries, e.g. 
the division of the image into a number of
rectangles. Furthermore, the

regions may be indicated manually by the user
taking into account important

objects in the query image. In this way, the user
ensures that a histogram is

made for a region comprising the object of 
interest. The choice of the region 
size is important since it governs the emphasis 
that is given to local 

information. In one extreme, the whole image is 
considered as a single region 

so that only global information is used for the 
comparison. In the other 

extreme, the region size matches the individual 
pixels. In one of the further 

embodiments of the retrieval system according to 
the invention, the images are 

divided into 4×4 rectangular regions.
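
The division into fixed rectangular regions can be sketched in Python as below. The function name, the 4 quantization levels per channel, and the way region boundaries are rounded are assumptions for this illustration; only the 4×4 grid itself is taken from the text.

```python
import numpy as np

def region_histograms(image, rows=4, cols=4, bins_per_channel=4):
    """One normalized color histogram per rectangular region, in row-major order."""
    image = np.asarray(image)
    height, width = image.shape[:2]
    row_edges = np.linspace(0, height, rows + 1).astype(int)
    col_edges = np.linspace(0, width, cols + 1).astype(int)
    n_bins = bins_per_channel ** 3
    histograms = []
    for r in range(rows):
        for c in range(cols):
            block = image[row_edges[r]:row_edges[r + 1],
                          col_edges[c]:col_edges[c + 1]]
            pixels = block.reshape(-1, 3).astype(np.int64)
            q = pixels * bins_per_channel // 256          # quantize each channel
            index = (q[:, 0] * bins_per_channel + q[:, 1]) * bins_per_channel + q[:, 2]
            counts = np.bincount(index, minlength=n_bins).astype(float)
            histograms.append(counts / max(counts.sum(), 1.0))
    return np.array(histograms)
```

Setting `rows=1, cols=1` reproduces the single global histogram, and increasing the grid size moves toward the per-pixel extreme, which mirrors the trade-off between global and local information described in the text.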

Detailed Description Text - DETX (25) : 

FIG. 2 schematically shows an image retrieval 

system according to the 

invention with multiple color histograms per image. 

In this system, the first 
histogram means 110 determine a first query color 
histogram 202 of a first 

region of the query image 106 and a second query 
color histogram 204 of a 

second region of the query image 106. In the same 
way, the second histogram 

means 114 determine a first candidate color 
histogram 206 of a first region of 
the particular candidate image 118 and a second 
candidate color histogram 208 

of a second region of the particular candidate 
image 118. The example in FIG. 

2 shows 2 color histograms per image, but this is 
mainly for the purpose of 
illustration since in practice the system will have
more than 2 color 

histograms per image, for instance 8 or 16. In a 
subsequent step the 

determining means 120 of the system makes multiple 
pair-wise comparisons of the 

respective color histograms and determines a 
similarity for each comparison. 

These individual similarities are combined into one 
overall similarity 

indicating how similar the candidate image and the 
query image are, taking into 

account the local information. The determining 
means determine a first 

similarity 210 on the basis of the first query
color histogram 202 and the

first candidate color histogram 206. This first
similarity 210 indicates how

similar the first region of the query image 106 is
to the first region of the 

candidate image 118. The determining means further 
determine a second 

similarity 212 on the basis of the second query 
color histogram 204 and the 

second candidate color histogram 208. This second 
similarity 212 indicates how 

similar the second region of the query image 106 is
to the second region of the 

candidate image 118. Subsequently, the determining 
means determine an overall 

similarity 214 on the basis of the first similarity 
210 and the second 

similarity 212. This overall similarity 214 
indicates the similarity between 

the query image 106 and the candidate image as a 
whole, taking into account the 

local information captured through the division in 
regions. The overall 

similarity 214 is used to rank the candidate images 
stored in the database 102. 
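
The pair-wise comparison of corresponding regions and the combination into one overall similarity can be sketched in Python as follows. The excerpt does not state how the individual similarities are combined, so the arithmetic mean used here is an assumption, as are the function names and the use of histogram intersection as the default per-region measure.

```python
import numpy as np

def intersection(h1, h2):
    """Per-region similarity used by default in this sketch (assumed choice)."""
    return float(np.minimum(h1, h2).sum())

def overall_similarity(query_hists, candidate_hists, measure=intersection):
    """Compare region i of the query with region i of the candidate, then average."""
    if len(query_hists) != len(candidate_hists):
        raise ValueError("both images need the same number of regions")
    per_region = [
        measure(np.asarray(q, dtype=float), np.asarray(c, dtype=float))
        for q, c in zip(query_hists, candidate_hists)
    ]
    # Arithmetic mean as the combination rule (assumption, not from the source).
    return sum(per_region) / len(per_region), per_region
```

Because regions are matched by position, two images with the same global colors but a different layout receive a lower overall value than they would from a single whole-image histogram.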

Detailed Description Text - DETX (50) :

FIG. 5 shows an overview of the method according 
to the invention. In a 

first step 502, a query image is obtained 
containing the wishes of the user. 

This image may be composed from existing images or 
may be sketched by the user, 

possibly on the basis of an existing image. Then 
in a second step 504, a query 

color histogram of the query image is determined. 
This query color histogram 

will be used in comparing the query image with 
candidate images from a 

database. In a third step 506, a candidate color 
histogram of one of such 

candidate images is obtained. Preferably this
candidate color histogram has
been prepared in advance at the moment the
candidate image was stored in
the database. Obtaining the candidate color
histogram then comes down to
simply retrieving the histogram. Alternatively,
the candidate color histogram 

could be created at this instant, i.e. at the time 
when it is needed. When the 

candidate color histogram has been obtained, the 
similarity between the query 

image and the candidate image is determined in a 
determining step 508. If in a 

comparison step 510 it is ascertained that the 
images are similar enough, the 

particular candidate image is retrieved from the 
database in retrieval step 

512. The particular candidate image may be 
directly presented to the user or 
may be temporarily stored in a file for later 
inspection. Then in step 514 it 

is determined whether all candidate images in the
database have been dealt

with. If this is not the case, a candidate color 
histogram of a next candidate 

image is obtained in step 506 and the process is 
repeated for this next 
candidate image. 
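
The loop over steps 506 to 514 can be sketched in Python as below, assuming the candidate histograms were prepared in advance and are held in a dictionary keyed by image identifier. The function names, the threshold value, and the L1 distance used as a stand-in dissimilarity are all assumptions for this illustration; any of the measures described in the text could be passed in instead.

```python
import numpy as np

def l1_distance(h1, h2):
    """Stand-in dissimilarity (assumption); smaller means more similar."""
    return float(np.abs(np.asarray(h1, dtype=float) - np.asarray(h2, dtype=float)).sum())

def retrieve(query_hist, stored_hists, max_distance=0.5, measure=l1_distance):
    """Return (image_id, distance) pairs that are similar enough, best first."""
    selected = []
    for image_id, candidate_hist in stored_hists.items():  # steps 506 and 514
        distance = measure(query_hist, candidate_hist)      # step 508
        if distance <= max_distance:                        # step 510
            selected.append((image_id, distance))           # step 512
    return sorted(selected, key=lambda item: item[1])
```

Keeping precomputed histograms in the store means the per-query work is limited to one histogram for the query image plus one comparison per candidate.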

Claims Text - CLTX (9) : 

retrieval means for retrieval of the selected 
candidate image, selection 

thereof being determined by the determining means 
on the basis of information 

conveyed by the first candidate color histogram in 

response to information 

requested by the first query color histogram.

Claims Text - CLTX (10) : 

2. An image retrieval system according to claim
1, wherein the determining
means are arranged to determine the first
similarity on the basis of the

Kullback informational divergence between the first
candidate color histogram
and the first query color histogram.

Claims Text - CLTX (14) : 

4. An image retrieval system according to claim
1, wherein the determining 
means are arranged to determine the first 
similarity on the basis of the 

entropy of the distribution of said similarity 
coefficients over the bins of 

the first candidate color histogram and the bins of 

the first query color 

histogram. 

US-CL-CURRENT: 382/165, 382/162

ABSTRACT:

A color correlogram (10) is a representation 
expressing the spatial 

correlation of color and distance between pixels in 
a stored image. The color 

correlogram (10) may be used to distinguish objects 
in an image as well as 

between images in a plurality of images. By 
intersecting a color correlogram 

of an image object with correlograms of images to 
be searched, those images 

which contain the objects are identified by the 
intersection correlogram. 
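
A simplified Python sketch of the idea of correlating color with pixel distance follows. It computes only an autocorrelogram (same-color pairs) and samples only horizontal and vertical offsets, both of which are simplifying assumptions for this illustration; the function name, the default distances, and the input format (a 2-D array of already quantized color indices) are likewise assumed and are not taken from the patent.

```python
import numpy as np

def autocorrelogram(labels, n_colors, distances=(1, 3, 5)):
    """result[c, d]: fraction of sampled neighbors at distance d of a pixel
    of color c that also have color c.

    labels: 2-D array of non-negative integer color indices below n_colors.
    """
    labels = np.asarray(labels)
    result = np.zeros((n_colors, len(distances)))
    for d_idx, k in enumerate(distances):
        same = np.zeros(n_colors)
        total = np.zeros(n_colors)
        # Pixel pairs separated by k columns, then by k rows.
        pairs = ((labels[:, :-k], labels[:, k:]), (labels[:-k, :], labels[k:, :]))
        for a, b in pairs:
            a = a.ravel()
            b = b.ravel()
            # Each pair is seen once from each of its two pixels.
            total += np.bincount(a, minlength=n_colors) + np.bincount(b, minlength=n_colors)
            same += 2 * np.bincount(a[a == b], minlength=n_colors)
        result[:, d_idx] = np.divide(same, total, out=np.zeros(n_colors), where=total > 0)
    return result
```

Unlike a plain histogram, this representation distinguishes a checkerboard from two solid halves even though both contain the same number of pixels of each color.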

18 Claims, 4 Drawing figures



Exemplary Claim Number: 1 
Number of Drawing Sheets: 4 

Brief Summary Text - BSTX (6):

Color histograms are commonly used as feature 
vectors for image retrieval 

and for detecting cuts in video processing because 
histograms are efficient to 
compute and insensitive to camera motions. 
Histograms are not robust to local 

changes in images, so false positives easily occur 
using histograms. Though 

the histogram is easy to compute and seemingly 
effective, it is liable to cause 

false positive matches, especially where databases 
are large, and is not robust 

to large appearance changes. Another disadvantage 
of the color histogram is 

insensitivity to illumination changes. Recently, 
several approaches have 

attempted to improve upon the histogram by 
incorporating spatial information 

with color. Many of these methods are still unable 
to handle large changes in 

appearance. For instance, the color coherence 
vector (CCV) method uses the 

image feature(s), e.g. spatial coherence of colors
and pixel position, to 

refine the histogram. These additional features 
improve performance, but also 

require increased storage and computation time. 

Brief Summary Text - BSTX (24):

Experimental evidence shows that the color 
correlogram outperforms not only 
color histograms but also more recent histogram 
refinements such as the color 

coherence vector method for image indexing and 
retrieval.

ABSTRACT:

A system and method for manipulating a histogram 
to perform query by color 

statistics are described. In one embodiment, color 
elements of an object based 

upon a color space are quantized and a histogram is 
created from the color 

elements. Further, the histogram is manipulated 
and displayed. In one 

embodiment, a database of images is queried by 
comparing the edited histogram 

with at least one existing histogram maintained in 
the database and at least 

one image corresponding to the at least one 
existing histogram is displayed. 

38 Claims, 15 Drawing figures 

Exemplary Claim Number: 1 






Number of Drawing Sheets: 

Brief Summary Text - BSTX (10) :

A color histogram may provide a convenient 
graphical interface to the 

retrieval of images that are similar in overall 
color content and provides a 

definition of the color representation of an image. 

The color histogram of an 
image describes its color distribution. Every 
pixel in the image corresponds 

to a point in a three-dimensional color space in 

which a similar image set may 

be selected based on the color distribution 

ABSTRACT:

An efficient method for retrieving an image 
using multiple features for an 
image subregion is disclosed. The present 
invention obtains a regional 

representative color for the subregions and uses 
the regional representative 

color during a retrieval of a similar image if the 
regional representative 

color is reliable. Otherwise, a feature 
information other than the regional 
representative color is used. 

19 Claims, 4 Drawing figures 

Exemplary Claim Number: 1 

Number of Drawing Sheets: 2 

Brief Summary Text - BSTX (7):

In another image retrieving method, the colors 
of all pixels in a subregion 

are represented with a plurality of values, such as
a color histogram, and the

image is represented using the values as the
subregion information to retrieve

an image. Although this method may perform better,
due to the use of the 

plurality of color values, a longer retrieval time 
is necessary. For example, 

when using a color histogram of nth dimension, a 

comparison of n elements is 

required. 

Detailed Description Text - DETX (11) : 

FIG. 1 shows a flowchart of a method for 
retrieving an image in accordance
with the preferred embodiment of the present
invention, where an image is divided 
in length and width directions by a fixed ratio, 
i.e. into grid regions, and 

each grid region or cell units. Thereafter, a 
regional representative color C 
for each cell and a confidence measure of the 
regional representative color C, 

a set of main colors such as a histogram H, and 

texture information such as an 

edge direction component are extracted 

(S10~S30).

Detailed Description Text - DETX (19) : 

Although the preferred method for retrieving an 
image uses a combination of 
both a color histogram and texture information in

a similarity determination 

when the confidence measure of a regional 

representative color C is less than 

the threshold value, only one additional feature 

may be used by setting one of 

either A or B to zero. Alternatively, one of steps 
S20 or S30 for obtaining a 

color histogram or a texture histogram may be 
eliminated to simplify the 

system. In such case, the weight value for the 
corresponding feature 
information would be set to zero. 

ABSTRACT:

A method for characterizing an image where a 
number of test areas of 

predefined shape and size are located on the image. 

The color or the texture 
of the image over each of the test areas is 
quantified. The image can be 

characterized by statistical descriptions of the 

frequency distribution of 

color or texture of the test areas. 

8 Claims, 12 Drawing figures 

Exemplary Claim Number: 1 

Brief Summary Text - BSTX (6):

Several processes have been proposed which 
attempt to preserve some of the 
spatial information that is discarded in the 
construction of a color histogram.
Pass et al. in the paper entitled HISTOGRAM
REFINEMENT FOR CONTENT BASED IMAGE
RETRIEVAL proposed refining the color histogram
with color coherence vectors.

In this process the coherence of the color of a 
picture element in relation to 

that of other picture elements in a contiguous 
region is determined. Even 

though the number of picture elements of each color 
is equal and, therefore, 

the color histograms are identical for two images,
differences between features 

in the images will mean that the numbers of picture 
elements of each color 

which are color coherent will vary. Color 
coherence vectors do embed some 
spatial information in the descriptors. 
Unfortunately, they require at least 
twice as much additional storage space as a 
traditional histogram. 

Brief Summary Text - BSTX (7) : 

Rickman et al. in the paper entitled
CONTENT-BASED IMAGE RETRIEVAL USING 
COLOUR TUPLE HISTOGRAMS proposed image 
characterization by construction of a 
histogram of the color hue at the vertices of 
randomly located triangular color 
tuples. Since the vertices of the triangular
tuples are spaced apart, some 

spatial information is retained. Unfortunately, it 
is difficult to determine 

the dominant color of an image from the color tuple 
data. Further, the 

retained spatial information is difficult to 

interpret in a normal sense, 

therefore making it difficult to use the 

information for indexing an image 

database.

Other Reference Publication - OREF (10) : 

Content-Based Image Retrieval Using Color
Tuple Histograms; Brunel

University, Middlesex, UK; 1/96; pp. 2-7.

US-CL-CURRENT: 382/165, 382/199, 382/305, 707/6

ABSTRACT: 

A method for representing an image in an image 
retrieval database first 

separates and filters images to extract color and 
texture features. The color

and texture features of each image are partitioned 
into a plurality of blocks. 

A joint distribution of the color features and a 
joint distribution of the 

texture features are estimated for each block. The 
estimated joint 

distributions are stored in the database with each 

image to enable retrieval of 

the images by comparing the estimated joint 
distributions.

17 Claims, 7 Drawing figures 
Exemplary Claim Number: 1 
Number of Drawing Sheets: 5 



KWIC 



Brief Summary Text - BSTX (5) : 

Most of the current content-based image 
retrieval systems rely on global 
image characteristics such as color and texture 
histograms, e.g., see

AltaVista's "Photofinder." While these simple
global descriptors are fast and 

often do succeed in partially capturing the essence 

of the user's query, global 

descriptors often fail due to the lack of 

higher-level knowledge about what 

exactly was of interest to the user in the query 

image, i.e., user-defined 

content. Recently, there has been a gradual shift 
towards spatially-encoded 

image representations. Spatially-encoded
representations range widely from 

fixed image partitioning, as in the "ImageRover,"
to highly local

characterizations like the "color correlograms";
see Sclaroff et al. in

"Imagerover: A content-based image browser for the 
world wide web," Proc. IEEE 

Workshop on Content-Based Access of Image and Video 
Libraries, June 1997, and 

Huang et al. in "Image indexing using color 
correlograms," Proc. IEEE Conf.

on Computer Vision and Pattern Recognition, 1997. 

US-PAT-NO: 6351556

DOCUMENT-IDENTIFIER: US 6351556 B1

TITLE: Method for automatically comparing content of images for
classification into events

DATE-ISSUED: February 26, 2002 

INVENTOR-INFORMATION:

NAME                  CITY        STATE   ZIP CODE   COUNTRY
Loui; Alexander C.    Penfield    NY      N/A        N/A
Pavie; Eric S.        Rochester   NY      N/A        N/A

US-CL-CURRENT: 382/164, 382/168 

ABSTRACT: 

A method for comparing image content of first
and second images, the method
comprises the steps of extracting a portion of both
the first and second images
both of which portions are determined to include a
main subject area of each
image; dividing the main subject area of the images
into a plurality of blocks;
computing a color histogram for one block in each
image; computing a histogram
intersection value between the block of the first
image and the block of the
second image; and determining a first threshold
value for the computed
histogram intersection value that determines
similarity between the block in
the first image and the block in the second image.
18 Claims, 14 Drawing figures 
Exemplary Claim Number: 1 
Number of Drawing Sheets: 11 



KWIC 



Brief Summary Text - BSTX (4): 

Pictorial images are often classified by the 
particular event, subject or 

the like for convenience of retrieving , reviewing , 
and albuming of the images . 

Typically, this has been achieved by manually 
segmenting the images, or by the 
below-described automated method. The automated

method includes grouping by 

color, shape or texture of the images for 

partitioning the images into groups 

of similar image characteristics. 

Brief Summary Text - BSTX (8) : 

The present invention is directed to overcoming 
one or more of the problems 

set forth above. Briefly summarized, according to 
one aspect of the present 

invention, the invention resides in a method for 
comparing image content of 

first and second images, the method comprising the 
steps of: (a) extracting a 

portion of both the first and second images both of 
which portions are 

determined to include a main subject area of each 
image; (b) dividing the main 

subject area of the images into a plurality of 
blocks; (c) computing a color 
histogram for one block in each image; (d) 
computing a histogram intersection 

value between the block of the first image and the 
block of the second image; 

and (e) determining a first threshold value for the 
computed histogram 

intersection value that determines similarity 

between the block in the first 

image and the block in the second image.
US-PAT-NO: 5751286

DOCUMENT-IDENTIFIER: US 5751286 A

TITLE: Image query system and method

DATE-ISSUED: May 12, 1998

INVENTOR-INFORMATION:

NAME                        CITY            STATE   ZIP CODE   COUNTRY

Barber; Ronald Jason        San Jose        CA      N/A        N/A
Beitel; Bradley James       Woodside        CA      N/A        N/A
Equitz; William Robinson    Palo Alto       CA      N/A        N/A
Flickner; Myron Dale        San Jose        CA      N/A        N/A
Niblack; Carlton Wayne      San Jose        CA      N/A        N/A
Petkovic; Dragutin          Los Gatos       CA      N/A        N/A
Work; Thomas Randolph       San Francisco   CA      N/A        N/A
Yanker; Peter Cornelius     Mountain View   CA      N/A        N/A

US-CL-CURRENT: 345/835, 345/838, 345/968, 382/209, 382/220, 382/305, 707/4, 707/6

ABSTRACT: 

Images in an image database are searched in response to queries which
include the visual characteristics of the images such as colors, textures,
shapes, and sizes, as well as by textual tags appended to the images. Queries
are constructed in an image query construction area in response to values of
representations of the visual characteristics and to locations of the
representations in the image query construction area.

31 Claims, 18 Drawing figures 
Exemplary Claim Number: 1 
Number of Drawing Sheets: 13 
Brief Summary Text - BSTX (11):

The second case is when a color image is specified, and similar images are
to be retrieved. A method for doing this is described in M. J. Swain, et al.,
"Color Indexing", International Journal of Computer Vision, 7(1):11-32, 1991.
The method uses "histogram intersection", in which a color histogram is
computed for each image in the database, and a corresponding color histogram is
computed for the query image. These histograms are computed over a quantized
version of the available color space, giving, for example, 256 bins in the
color histogram. A measure of similarity is defined for two histograms, and a
query is run by computing the similarity between the query image histogram and
the histogram of each image in the database.
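
As a minimal sketch of the histogram computation described above, the following Python fragment quantizes RGB pixels into 256 bins and counts them. The 8 x 8 x 4 split of the color space (3 bits red, 3 bits green, 2 bits blue) and the names `quantize` and `color_histogram` are assumptions made for illustration; the cited text only states that a quantized color space with, for example, 256 bins is used.

```python
from typing import Iterable, List, Tuple

Pixel = Tuple[int, int, int]  # (R, G, B), each component in 0..255


def quantize(pixel: Pixel) -> int:
    """Map an RGB pixel to one of 256 bins (assumed 8 x 8 x 4 split)."""
    r, g, b = pixel
    # 3 bits of red, 3 bits of green, 2 bits of blue give 8 * 8 * 4 = 256 bins.
    return ((r >> 5) << 5) | ((g >> 5) << 2) | (b >> 6)


def color_histogram(pixels: Iterable[Pixel], bins: int = 256) -> List[float]:
    """Count the pixels falling in each bin and normalize the counts to sum to 1."""
    counts = [0] * bins
    total = 0
    for pixel in pixels:
        counts[quantize(pixel)] += 1
        total += 1
    if total == 0:
        raise ValueError("image has no pixels")
    return [c / total for c in counts]
```

Normalizing by the pixel count is a design choice of this sketch: it makes histograms of images with different sizes directly comparable.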
Brief Summary Text - BSTX (12):

A more sophisticated method for retrieving images similar to a given image
is given in Mikihiro Ioka, "A Method of Defining the Similarity of Images on
the Basis of Color Information", Technical Report RT-0030, IBM Tokyo Research
Lab, 1989. Here, each image in the database (actually, the subimage of each
image containing a single, dominant object in the image) is partitioned into
blocks, for example, 25 blocks. Within each block, the reduced bucket
histogram, h, (say, 256 buckets) is computed. A given query image or object is
also partitioned into the same number of blocks and the histograms computed. A
similarity measure s(h.sub.query_image, h.sub.database_item) is defined on the
color histograms computed in the blocks, and the measure is extended to images
as:
US-PAT-NO: 6285995

DOCUMENT-IDENTIFIER: US 6285995 B1

TITLE: Image retrieval system using 

a query image 

DATE-ISSUED: September 4, 2001

INVENTOR-INFORMATION : 

NAME CITY 

STATE ZIP CODE COUNTRY 

Abdel-Mottaleb; Mohammed S. Ossining 

NY N/A N/A 

Krishnamachari; Santhana Ossining 

NY N/A N/A 

US-CL-CURRENT: 707/3, 707/2 , 707/6 

ABSTRACT: 

An image retrieval system contains a database 
with a large number of images. 

The system retrieves images from the database that 
are similar to a query image 

entered by the user. The images in the database 
are grouped in clusters 

according to a similarity criterion so that 
mutually similar images reside in 
the same cluster. Each cluster has a cluster 
center which is representative 

for the images in it. A first step of the search
for similar images selects the

clusters that may contain images similar to the
query image, by comparing the
query image with the cluster centers of all 
clusters. A second step of the 

search compares the images in the selected clusters 
with the query image in

order to determine their similarity with the query 
image . 

10 Claims, 7 Drawing figures 
Exemplary Claim Number: 1 
Number of Drawing Sheets: 4 

KWIC 

Brief Summary Text - BSTX (5) : 

An image retrieval system and a method as 
described above, are known from 

the article "Tools and Techniques for Color Image 
Retrieval " , John R. Smith and 

Shih-Fu Chang, Proc. SPIE — Int. Soc. Opt. Eng 
(USA), Vol. 2670, pp. 

426-437. The image retrieval system comprises a 
database with a large number 

of images . A user searching for a particular image 
specifies a query image as 

to how the retrieved image or images should look 
like. Then the system 

compares the stored images with the query image and 
ranks the stored images 

according to their similarity with the query image. 

The ranking results are 
presented to the user who may retrieve one or more 
of the images . The 

comparison of the query image with a stored image 
to determine the similarity 

may be based on a number of features derived from 
the respective images. The 

image feature or features used for comparison are 
called a feature vector. The 

article describes the usage of a color histogram as 
such a feature vector.

When using the RGB (Red, Green and Blue) 

representation of an image, a color 

histogram is computed by quantizing the colors 

within the image and counting 

the number of pixels of each color. To determine 
the similarity, a number of 

techniques are described to compare the two color 
histograms of the respective 

images. An example of such technique is the 

histogram intersection, where the 

similarity is the sum over all histogram bins of 

the minimal value of the pair 

of corresponding bins of the two histograms. 
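
The histogram intersection described above can be sketched in a few lines of Python. This is an illustration under stated assumptions, not the implementation of the cited article: histograms are taken to be equal-length sequences normalized to sum to 1, and the helper `rank_by_similarity` and the dictionary layout of the database are hypothetical.

```python
from typing import Dict, List, Sequence


def histogram_intersection(h1: Sequence[float], h2: Sequence[float]) -> float:
    """Sum over all bins of the smaller value of each pair of corresponding bins."""
    if len(h1) != len(h2):
        raise ValueError("histograms must have the same number of bins")
    return sum(min(a, b) for a, b in zip(h1, h2))


def rank_by_similarity(query: Sequence[float],
                       database: Dict[str, Sequence[float]]) -> List[str]:
    """Return image names ordered from most to least similar to the query."""
    return sorted(database,
                  key=lambda name: histogram_intersection(query, database[name]),
                  reverse=True)
```

For normalized histograms the intersection lies between 0 (no overlap in any bin) and 1 (identical histograms), so it can be used directly as a ranking score.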

Brief Summary Text - BSTX (16) : 

An embodiment of the image retrieval system 
according to the invention is 

defined in claim 2. The similarity between images 
may be determined on the 

basis of their color histograms . The average of 
the respective histograms of a 

number of representative images of a cluster can 
advantageously be used as a 
representation for the whole cluster. 

Detailed Description Text - DETX (3) : 

FIG. 1 schematically shows an image retrieval 

system according to the 

invention. The system 100 comprises a database 102 
with a potentially large 

collection of candidate images. A purpose of the 
system is to retrieve from 

the collection one or more images that match the 
wishes of a user of the 

system. The system performs a content based search 
in the collection of the 

images, i.e. the content of an image is used as the 
search or ranking

criterion, as opposed to systems that search on the 
basis of keywords in 

annotation added to the images. The images in the 
database according to the 

invention are grouped in clusters, of which 
clusters 104, 106 and 108 are 

shown. Images of a cluster are to a certain extent 
similar with each other. 

For instance cluster 108 contains images 110, 112, 
114 and 116 which are 

according to a certain measure similar with each 
other. The content of an 

image is represented in the system by a so-called
feature vector, e.g. image 116 

has a feature vector 118. In the system according 
to the invention, a color 

histogram of the image is used as feature vector 
but the type of feature vector 
is not essential to the invention and other 
measures expressing the 

characteristics of the content of an image may be 
used. The feature vector may 

be stored in the database with the image itself or 
at some other location in 

the database, e.g. in a table with feature vectors 
of other images including a 

reference to the image. A cluster has a cluster 
center representing the 

contained images, e.g. cluster 108 has cluster 
center 120. In the system 

according to the invention, the cluster center is 
the average of the color 

histograms of a number of representative images in 
the cluster. Another kind 

of cluster center may be used, e.g. the feature 
vector of a single image which 
is chosen as the representative image for all 
images in the cluster. 
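
The cluster organization described above lends itself to a two-step search: compare the query with the cluster centers first, then only with the images of the selected clusters. The Python sketch below illustrates this under several assumptions: every member of a cluster is treated as a representative image when averaging, histogram intersection is used as the similarity measure, and the names `cluster_center`, `two_step_search` and the parameter `keep` are hypothetical.

```python
from typing import Dict, List, Sequence

Histogram = Sequence[float]


def intersection(h1: Histogram, h2: Histogram) -> float:
    """Histogram intersection, used here as the similarity measure."""
    return sum(min(a, b) for a, b in zip(h1, h2))


def cluster_center(histograms: Sequence[Histogram]) -> List[float]:
    """Bin-wise average of the histograms of the representative images."""
    n = len(histograms)
    return [sum(column) / n for column in zip(*histograms)]


def two_step_search(query: Histogram,
                    clusters: Dict[str, Dict[str, Histogram]],
                    keep: int = 1) -> List[str]:
    """Select the `keep` best clusters by center, then rank only their images."""
    centers = {cid: cluster_center(list(members.values()))
               for cid, members in clusters.items()}
    selected = sorted(centers,
                      key=lambda cid: intersection(query, centers[cid]),
                      reverse=True)[:keep]
    candidates = {name: hist
                  for cid in selected
                  for name, hist in clusters[cid].items()}
    return sorted(candidates,
                  key=lambda name: intersection(query, candidates[name]),
                  reverse=True)
```

With `keep=1` only the images of the best-matching cluster are compared with the query, which is where the saving over an exhaustive comparison comes from.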
Detailed Description Text - DETX (6):

In an embodiment of the image retrieval system 
according to the invention, 

the feature vector of an image is its color 
histogram . The similarity measure 
between two images is calculated on the basis of 
the two color histograms of 

these images by determining the so-called histogram 
intersection. This 

technique is described in the article "Tools and 
Techniques for Color Image 

Retrieval " , John R. Smith and Shih-Fu Chang, Proc. 
SPIE — Int. Soc. Opt. Eng 
(USA), Vol. 2670, pp. 426-437. 

Detailed Description Text - DETX (20) : 

In a still further embodiment of the image 
retrieval system according to the 

invention, similarity coefficients are determined 
for each pair of 

corresponding bins of the two color histograms 

between which a similarity must 

be determined. Subsequently the obtained 

collection of similarity coefficients 

is treated as a probability distribution and the 

question as how similar the 

two histograms are, is then answered by analyzing 
this probability 

distribution. In this embodiment , the similarity 

coefficients are calculated 

using the following equation: ##EQU4## 

Detailed Description Text - DETX (41) : 

In the embodiments of the image retrieval system 
described above, a single 

color histogram is made from the whole image. 
Because of this, the spatial 
information from the image is lost and the
comparison of two images reflects 
only global similarity. For example, if a user
enters a query image with a sky 

at the top and sand at the bottom, the retrieved 
images are expected to have a 

mix of blue and beige, but not necessarily a sky 
and sand. A desirable result 

for the retrieved candidate images would be images 
with blue at the top and 

beige at the bottom. In order to achieve this 
result, a further embodiment of 

the system according to the invention determines a 
color histogram for a number 

of respective regions of the query image and 
compares these determined 

histograms with histograms of corresponding regions 
of the candidate image. 

The query image may be divided into regions using 
prefixed boundaries, e.g. the 

division of the image into a number of rectangles. 
Furthermore, the regions 

may be indicated manually by the user taking into 
account important objects in 

the query image. In this way, the user forces that 
a histogram is made for a 

region comprising the object of interest. The 
choice of the region size is 

important since it governs the emphasis that is 
given to local information. In 

one extreme, the whole image is considered as a 
single region so that only 

global information is used for the comparison. In 
the other extreme, the 

region size matches the individual pixels. In one 
of the further embodiments 

of the retrieval system according to the invention, 
the images are divided into
4 x 4 rectangular regions.
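
The effect of comparing regions instead of whole images can be shown with a small Python sketch. It assumes the image has already been quantized into a two-dimensional array of bin indices, uses histogram intersection as the per-region measure, and introduces the hypothetical helpers `region_histograms` and `region_similarities`; the 4 x 4 grid follows the embodiment mentioned above.

```python
from typing import List, Sequence


def region_histograms(bin_image: Sequence[Sequence[int]], bins: int,
                      grid: int = 4) -> List[List[float]]:
    """Return one normalized histogram per cell of a grid x grid layout."""
    height, width = len(bin_image), len(bin_image[0])
    if height < grid or width < grid:
        raise ValueError("image is smaller than the region grid")
    histograms = []
    for gy in range(grid):
        y0, y1 = gy * height // grid, (gy + 1) * height // grid
        for gx in range(grid):
            x0, x1 = gx * width // grid, (gx + 1) * width // grid
            counts = [0] * bins
            for y in range(y0, y1):
                for x in range(x0, x1):
                    counts[bin_image[y][x]] += 1
            area = (y1 - y0) * (x1 - x0)
            histograms.append([c / area for c in counts])
    return histograms


def region_similarities(query: Sequence[Sequence[int]],
                        candidate: Sequence[Sequence[int]],
                        bins: int, grid: int = 4) -> List[float]:
    """Histogram intersection of each pair of corresponding regions."""
    pairs = zip(region_histograms(query, bins, grid),
                region_histograms(candidate, bins, grid))
    return [sum(min(a, b) for a, b in zip(hq, hc)) for hq, hc in pairs]


# Sky (bin 0) above sand (bin 1), and the same image turned upside down.
sky_over_sand = [[0] * 8 for _ in range(4)] + [[1] * 8 for _ in range(4)]
sand_over_sky = sky_over_sand[::-1]
```

Both test images have the same global histogram (half sky, half sand), yet every regional similarity between them is zero, which is the spatial discrimination the paragraph above asks for.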
Detailed Description Text - DETX (61):

The images in the database of the retrieval 

system according to the 

invention are organized into clusters so as to 
allow a search to images similar 
with a given query image without the need of 
comparing all images with the 

query image. According to the invention, clusters 
of images are defined 

whereby similar images are grouped in a same 
cluster and a cluster center is 

defined for such cluster which is representative of 
the images in the cluster. 

In an embodiment of the method of organizing the 
images in the database 

according to the invention, the images are 
clustered in a hierarchical way. 

The number of images in the database is n and the 
similarities between all 

pairs of images is precomputed. The calculation of 
the similarities between 

the candidate images in the database is carried out 
using the same feature 

vector described above for the calculation of the 
similarity between the query 

image and a candidate image, namely the color 
histograms of the relevant 

images. However, a different type of feature 
vector may be used since the 

process of clustering the images in the database is 

not directly linked to the 

process of searching the database. The 

hierarchical clustering is carried out 

as follows: 



Claims Text - CLTX (8): 

2. An image retrieval system according to claim 1, in which at least one of
the cluster centers is represented by a color histogram which is the average of
respective color histograms of a number of representative images in the
particular cluster.
US-PAT-NO: 6621926

DOCUMENT-IDENTIFIER: US 6621926 B1

TITLE: Image retrieval system and method using image histogram

DATE-ISSUED: September 16, 2003

INVENTOR-INFORMATION:

NAME                CITY      STATE   ZIP CODE   COUNTRY

Yoon; Ho Sub        Taejon    N/A     N/A        KR
Soh; Jung           Taejon    N/A     N/A        KR
Min; Byung Woo      Taejon    N/A     N/A        KR
Yang; Young Kyu     Taejon    N/A     N/A        KR

US-CL-CURRENT: 382/168, 382/305 
ABSTRACT: 

An image retrieval system and method using an 
image histogram for 

determining central points and dispersion values as 
well as quantity 

information of color about respective histogram 

bins, thereby using these as 

mapping information for image retrieval . The image 
retrieval method using an 

image histogram includes the following steps. A 
first step of computing an 

image histogram bin when an image is inputted, and 
accumulating values of x, y, 
x.sup.2, y.sup.2 to compute central points and
dispersion values . A second 

step of normalizing the respective central points 
and dispersion values through 

dividing these by size of whole image, and storing 
these. A third step of 

generating a value of model to be retrieved by 
drawing a feature vector when a 
query image is inputted, and computing the 
difference between the generated 

value of model and central points and dispersion 
values of an image histogram, 

count, and number of corresponding bins within the 
data stored in the second 

step . A fourth step of specifying a similarity 
value of an image using the 
values computed in the third step . 

6 Claims, 3 Drawing figures 

Exemplary Claim Number: 1 

Number of Drawing Sheets: 3 

KWIC 



Abstract Text - ABTX (1) : 

An image retrieval system and method using an 
image histogram for 

determining central points and dispersion values as 
well as quantity 

information of color about respective histogram 

bins, thereby using these as 

mapping information for image retrieval . The image 
retrieval method using an 

image histogram includes the following steps. A 
first step of computing an 

image histogram bin when an image is inputted, and 
accumulating values of x, y,

x.sup.2, y.sup.2 to compute central points and 
dispersion values. A second 

step of normalizing the respective central points 
and dispersion values through 

dividing these by size of whole image, and storing 
these. A third step of 

generating a value of model to be retrieved by 
drawing a feature vector when a 
query image is inputted, and computing the 
difference between the generated 

value of model and central points and dispersion 
values of an image histogram, 

count, and number of corresponding bins within the 
data stored in the second 

step. A fourth step of specifying a similarity 
value of an image using the 
values computed in the third step. 

Brief Summary Text - BSTX (2) : 

The present invention relates to an image 
retrieval system and method using 

an image histogram, and, in particular, to an image 
retrieval system and method 

of using an image histogram, for determining 
central points and dispersion 

values as well as quantity information of color 
about respective histogram 

bins, thereby using these as mapping information 
for image retrieval . 

Brief Summary Text - BSTX (6) : 

This is because color feature values of the 
histogram, i.e., values of each 
bin show global feature information, and it is 
difficult to retrieve an image 

having correctly requested contents with only the 
global feature information. 






That is, a global feature is advantageous for not 
being affected by rotation of 

the image or a slight change of position, but has a 

drawback of not containing 

any spatial information. Because of such 

characteristics of a global feature 

that does not contain spatial information, when 

retrieving with only color 

information, a false positive error in the 
retrieval result can occur. 



Brief Summary Text - BSTX (11) : 

The disclosed embodiments of the present 
invention provide an image 

retrieval system and method using an image 

histogram for finding central points 
and dispersion values as well as quantity 
information of color about respective 
histogram bins, thereby using these as mapping 
information for image retrieval . 

Brief Summary Text - BSTX (15) : 

Also, the disclosed embodiments provide a 
storage medium containing a 
program that executes steps, including the 
following steps. A first step is 
when an image is inputted, converting the image 
into color coordinate system, 

and normalizing it to reduce the feature of the 
converted values. A second 

step is computing histogram color bins from the 
normalized values in the first 

step, and accumulating x, y, x.sup.2, and y.sup.2, 
thereby computing central 

points and dispersion values. A third step is 
normalizing the respective 

computed central points and dispersion values by 
dividing with the size of 
whole image, and storing it. A fourth step is when
a query image is inputted, 

generating a value of model to be retrieved by 

drawing a feature vector, and 

then computing the difference between the generated 
value of model and the 

number, count of color values, and central points 
and dispersion values of bins 

corresponding to the stored data in the third step. 

And a fifth step is 
specifying the similarity values of an image using 
the computed values in the 
fourth step. 
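
The accumulation of x, y, x.sup.2 and y.sup.2 per histogram bin can be illustrated with the Python sketch below. It is a sketch under assumptions rather than the patented procedure: the image is taken to be a two-dimensional array of bin indices, the dispersion is computed as the variance E[x.sup.2] - (E[x]).sup.2, and "dividing by the size of the whole image" is interpreted as dividing positions by width and height and variances by their squares. The names `BinStats` and `bin_statistics` are hypothetical.

```python
from typing import Dict, NamedTuple, Sequence


class BinStats(NamedTuple):
    count: float   # fraction of the image pixels that fall in the bin
    cx: float      # normalized central point, horizontal
    cy: float      # normalized central point, vertical
    var_x: float   # normalized dispersion, horizontal
    var_y: float   # normalized dispersion, vertical


def bin_statistics(bin_image: Sequence[Sequence[int]],
                   bins: int) -> Dict[int, BinStats]:
    """Accumulate x, y, x^2, y^2 per bin and derive central points and dispersions."""
    height, width = len(bin_image), len(bin_image[0])
    n = [0] * bins
    sx = [0] * bins
    sy = [0] * bins
    sxx = [0] * bins
    syy = [0] * bins
    for y, row in enumerate(bin_image):
        for x, b in enumerate(row):
            n[b] += 1
            sx[b] += x
            sy[b] += y
            sxx[b] += x * x
            syy[b] += y * y
    stats = {}
    for b in range(bins):
        if n[b] == 0:
            continue  # empty bins carry no spatial information
        mean_x, mean_y = sx[b] / n[b], sy[b] / n[b]
        var_x = sxx[b] / n[b] - mean_x * mean_x
        var_y = syy[b] / n[b] - mean_y * mean_y
        stats[b] = BinStats(count=n[b] / (width * height),
                            cx=mean_x / width, cy=mean_y / height,
                            var_x=var_x / (width * width),
                            var_y=var_y / (height * height))
    return stats
```

Because only running sums are kept, the statistics are obtained in a single pass over the image, in the same loop that fills the histogram.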

Detailed Description Text - DETX (24): 

Although the present invention is illustrated 
and shown in connection with 

an image retrieval method using a color histogram, 

also in the case of a gray 

histogram, the image retrieval method using the 
mean value and the dispersion 

value, which are proposed in the present invention, 
can be applied. 

Claims Text - CLTX (3) : 

3. The image retrieval method according to 
claim 2, wherein the first step 

further comprises: a first sub-step for converting 
the image into a color 

coordinate system when a color image is inputted, 
and normalizing this to 

reduce the feature of the converted value; and a 
second sub-step for computing

a color histogram bin from the normalized value in 
the first sub-step. 

Other Reference Publication - OREF (3) : 
Rickman et al., "Content-Based Image Retrieval Using Colour Tuple
Histograms", SPIE V. 2670, p. 2-7, 1996.*






PUB-NO: WO009931605A1 

DOCUMENT-IDENTIFIER: WO 9931605 A1

TITLE: IMAGE RETRIEVAL SYSTEM 

PUBN-DATE: June 24, 1999 



INVENTOR- INFORMATION : 

NAME 

COUNTRY 

ABDEL-MOTTALEB, MOHAMMED S N/A 
DESAI, RANJIT N/A



ASSIGNEE-INFORMATION : 

NAME 

COUNTRY 

KONINKL PHILIPS ELECTRONICS NV NL 
PHILIPS SVENSKA AB SE 



APPL-NO: IB09801983 

APPL-DATE: December 7, 1998 

PRIORITY-DATA: US99313097A ( December 18, 1997) 

INT-CL (IPC) : G06F017/30, G06K009/68 
ABSTRACT: 






In an image retrieval system, a database with a large number of images is
searched to find one or more images meeting the specification of a user. This
specification is given in the form of a query image. The system determines the
similarity between the query image and a particular image from the database by
comparing the color histograms of the two images. The histograms are treated
as statistical distributions and the similarity is determined on the basis of
an information theoretic measure of the distributions. In a first embodiment,
the similarity is determined using the Kullback informational divergence of the
two histograms. In a second embodiment, the similarity is based on the entropy
of the distribution of similarity coefficients of the two histograms.
(57) Abstract

In an image retrieval system, a database with a large number of images is searched to find one or more images meeting the specification 
of a user. This specification is given in the form of a query image. The system determines the similarity between the query image and 
a particular image from the database by comparing the color histograms of the two images. The histograms are treated as statistical 
distributions and the similarity is determined on the basis of an information theoretic measure of the distributions. In a first embodiment, 
the similarity is determined using the Kullback informational divergence of the two histograms. In a second embodiment, the similarity is
based on the entropy of the distribution of similarity coefficients of the two histograms.
BACKGROUND OF THE INVENTION

The invention relates to an image retrieval system which includes a
database with candidate images, an entry unit for entering a query image, and a first
histogram unit for deriving a first query color histogram from the query image.
A second histogram unit derives a first candidate color histogram from a
particular candidate image. A determining unit determines a first similarity between the
particular candidate image and the query image on the basis of the first candidate color
histogram and the first query color histogram, and a retrieval unit retrieves the particular
candidate image.

The invention further relates to a method for determining a similarity
between a candidate image and a query image.
A first step obtains the query image, a second step derives a query color
histogram from the query image, a third step obtains a candidate color histogram from the
candidate image, and a determining step determines the similarity between the
candidate image and the query image on the basis of the candidate color histogram and the
query color histogram.

Image retrieval systems are of importance for applications that involve
large collections of images. Professional applications include broadcast stations, where a piece
of video may be identified through a set of shots and where a shot of video is to be
retrieved according to a given image. Movie producers must likewise be able to locate scenes
among a large number of scenes. Furthermore, art museums have large collections of
images of their paintings, photos and drawings, and must be able to retrieve images on
the basis of some criterion. Consumer applications include maintaining collections of slides,
photos and videos, from which the user must be able to retrieve items.

An image retrieval system and a method as described above are known
from the article "Tools and Techniques for Color Image Retrieval", John R. Smith and
Shih-Fu Chang, Proc. SPIE - Int. Soc. Opt. Eng (USA), Vol. 2670, pp. 426-437. The image
retrieval system includes a database with a large number of images. A user searching for a
particular image specifies a query image indicating how the retrieved image or images should
look. The system then compares the stored images with the query image and ranks the stored
images according to their similarity with the query image. The ranking results are presented
to the user, who may retrieve one or more of the images. The comparison of the query image
with a stored image to determine the similarity may be based on a number of features
derived from the respective images. The article describes the usage of a color histogram as
such a comparison feature. When using the RGB (Red, Green and Blue) representation of an
image, a color histogram is computed by quantizing the colors within the image and counting
the number of pixels of each color. To determine the similarity, a number of techniques are
described to compare the two color histograms of the respective images. The histogram
Euclidean distance is a simple measure calculated by comparing identical bins in respective
histograms. No cross-wise comparison is made between different bins which represent
perceptually similar colors. Furthermore, techniques for determining a histogram intersection
and techniques for determining a histogram quadratic distance are described. As an
alternative to the histogram techniques, a comparison technique based on color sets is
described. In this technique the color of a pixel is compared with a predetermined threshold.
If the color is below the threshold, the pixel does not become a member of the set;
otherwise it does become a member. A disadvantage is that a large number of pixels, all
below the threshold, will not contribute to the comparison in any way. Furthermore, there is
no discrimination between values above the threshold. The prior art techniques for
determining the similarity between the candidate image and the query image are complex to
execute and/or are occasionally inadequate.

SUMMARY OF THE INVENTION 

It is an object of the invention to provide an image retrieval system of the
kind set forth with an improved mechanism for determining the similarity between the
candidate image and the query image. This object is achieved in an image retrieval system
having the determining unit arranged to determine the first similarity on the basis of
information conveyed by the first candidate color histogram in response to information
requested by the first query color histogram. Determining the similarity between the
respective images using an information theoretic measure is superior to the known
techniques. The image retrieval system according to the invention is better able to establish
the similarity between the query image and the images in the database. So, the image
retrieval system according to the invention is superior in finding similar images and in
avoiding images that are not similar enough. Furthermore, the calculation of the information
theoretic measure requires less computational effort than the known techniques.

An embodiment of the image retrieval system according to the invention
uses a Kullback informational divergence. The Kullback informational divergence is a
measure for determining how different one statistical distribution is from another statistical
distribution. The inventor has realized that a color histogram can be treated as a statistical
distribution and that the Kullback informational divergence can be applied for comparing the
candidate color histogram with the query color histogram. Experiments have shown that
retrieval of images on the basis of a similarity obtained from applying the Kullback
informational divergence on the respective color histograms gives very good results.
Furthermore, the calculation of the Kullback informational divergence requires less
computational effort than the known techniques, which is very important since a large
number of candidate images may need to be compared with the query image.
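
As a hedged illustration, the Python sketch below computes the Kullback divergence in its standard textbook form, D(Q || C) = sum over bins of q_i * ln(q_i / c_i), for a query histogram Q and a candidate histogram C. The passage above does not spell out the formula, so the direction of the divergence, the small smoothing constant `eps` that keeps empty bins from producing a division by zero, and the function name `kullback_divergence` are all assumptions of this sketch.

```python
import math
from typing import List, Sequence


def _smooth(hist: Sequence[float], eps: float) -> List[float]:
    """Add eps to every bin and renormalize so the histogram sums to 1."""
    total = sum(hist) + eps * len(hist)
    return [(h + eps) / total for h in hist]


def kullback_divergence(query: Sequence[float], candidate: Sequence[float],
                        eps: float = 1e-10) -> float:
    """D(query || candidate); 0 for identical histograms, larger when less similar."""
    if len(query) != len(candidate):
        raise ValueError("histograms must have the same number of bins")
    q = _smooth(query, eps)
    c = _smooth(candidate, eps)
    return sum(qi * math.log(qi / ci) for qi, ci in zip(q, c))
```

The divergence is not symmetric in its arguments, and a lower value indicates a closer match, so candidates would be ranked in ascending order of the result.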

Another embodiment of the image retrieval system also considers entropy
of similarity coefficients. By determining the entropy of the distribution of the similarity
coefficients, an indication of the flatness of this distribution is obtained. Since a particular
similarity coefficient indicates the similarity between the candidate color histogram and the
query color histogram for the particular bin, the obtained flatness is a measure for the
similarity of the candidate color histogram and the query color histogram over all bins.
Experiments have shown that retrieval of images on the basis of a similarity based on the
entropy measure gives very good results. Furthermore, the calculation of the entropy requires
less computational effort than the known techniques, which is very important since a large
number of candidate images may need to be compared with the query image.
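
The idea of using entropy as a flatness measure can be sketched in Python as follows. The equation that produces the similarity coefficients is not reproduced in this text, so the sketch takes the per-bin coefficients as given, non-negative input and only shows the entropy step; the function name `coefficient_entropy` is hypothetical.

```python
import math
from typing import Sequence


def coefficient_entropy(coefficients: Sequence[float]) -> float:
    """Shannon entropy of the coefficients after normalizing them to sum to 1."""
    if any(c < 0 for c in coefficients):
        raise ValueError("similarity coefficients must be non-negative")
    total = sum(coefficients)
    if total <= 0:
        raise ValueError("at least one coefficient must be positive")
    probabilities = [c / total for c in coefficients]
    # Bins with zero probability contribute nothing to the entropy.
    return -sum(p * math.log(p) for p in probabilities if p > 0)
```

A perfectly flat set of coefficients over n bins gives the maximum value ln(n), while coefficients concentrated in a single bin give 0, so a higher entropy corresponds to histograms that agree evenly across all bins.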

A further embodiment of the image retrieval system according to the
invention compares two color histograms of respective regions of the candidate image with
two color histograms of corresponding regions of the query image, the spatial information in
the respective images being employed when determining the similarity. This improves the
accuracy of determining the similarity between the candidate image and the query image and
a better discrimination among the images in the database can be achieved.

A still further embodiment of the image retrieval system according to the
invention uses median statistics for determining the overall similarity from the similarities of
the regions. This is better than simply averaging the similarities of the regions. The median
statistics suppress the effect of large outliers which would negatively influence the perceived
similarity of the candidate image and the query image.

An embodiment of the image retrieval system according to the invention 
allows the user to compose the query image. Such a query image can be completely specified 
according to the user's wishes. The user may compose the query image by taking samples 
from images available in the system or may sketch an image from scratch. 



It is a further object of the invention to provide a method of the kind set 
forth with an improved step for determining the similarity between the candidate image and 
the query image on the basis of the candidate color histogram and the query color histogram. 
This object is achieved according to the invention in a method that is characterized in that the 
determining step includes determining the similarity on the basis of information conveyed by 
the candidate color histogram in response to information requested by the query color 
histogram. By determining the similarity between the respective images through an 
information theoretic measure better results are obtained. When this method is applied for 
searching an image, a better discrimination among the searched images with respect to the 
query image can be obtained. A further advantage of the method according to the invention 
is that the calculation of the information theoretic measure requires less computational effort 
than the known techniques, which is particularly important for searching through large 
collections of images. 



Further advantageous embodiments of the invention are discussed below. 



BRIEF DESCRIPTION OF THE DRAWINGS 



Figure 5 shows an overview of the method according to the invention. 
Corresponding features in the various Figures are denoted by the same 
reference symbols. 

DESCRIPTION OF THE PREFERRED EMBODIMENTS 

Figure 1 schematically shows an image retrieval system according to the 
invention. The system 100 includes a database 102 with a potentially large collection 104 of 
images. A purpose of such a system is to retrieve from the collection one or more images 
that match the wishes of a user of the system. Those wishes are specified via a query image 
106, which the user can enter into the system via entry means 108. The entry means may allow 
the user to compose the query image from a number of existing images or to create the query 
image from scratch. The system compares the query image with the candidate images in the 
database and determines for each candidate image how similar it is to the query image. The 
system ranks the candidate images according to the established similarity. The system 100 
compares images on the basis of their color histograms. To this end, the system includes first 
histogram unit 110 to determine a query color histogram 112 from the query image 106. The 
process of determining a color histogram from an image is explained with reference to 
Figure 3 below. The 
system also includes second histogram unit 114 to determine a candidate color histogram 116 
from a particular candidate image 118. The first histogram unit and the second histogram 
unit may be integrated into one histogram means, which can act on the query image for 
generating the query color histogram and on the particular candidate image for generating the 
candidate color histogram respectively. The system further includes determining unit 120 to 
determine a similarity 122 on the basis of the query color histogram 112 and the candidate 
color histogram 116. Based on the similarity, the system presents a ranking of the candidate 
images on a display 126. The user may select an image from this ranking, which is retrieved 
from the database via retrieval means 124 for further processing. This further processing 
may include temporarily storing the image in a file 128 for further selection. This may be 
implemented such that the system retrieves a number of candidate images and stores these in 
the file 128, from where the user makes a final selection as to which image is desired. In 
such a way of working, the system makes a first selection from the large collection in the 
database 102 and the user selects the image or images from the much smaller collection in 
file 128. 

In a first embodiment of the image retrieval system according to the 
invention, the two color histograms between which a similarity must be determined are 
treated as two probability distributions. The question of how similar the two histograms are 
can then be answered by measuring how different the one probability distribution is from the 
other. This difference between two statistical distributions is called informational divergence 
or Kullback informational divergence and is calculated with the following equation: 

$$D(Q \| P) = \sum_{x \in X} Q(x) \log \frac{Q(x)}{P(x)} \qquad (1)$$

In which: 

Q(x) is the normalized query color histogram, 
P(x) is the normalized candidate color histogram, and 
D(Q || P) is the Kullback informational divergence. 

A more detailed discussion on the Kullback informational divergence is 
presented in the textbook "Information Theory: Coding Theorems for Discrete Memoryless 
Systems", I. Csiszár and J. Körner, Akadémiai Kiadó, Budapest, 1981, pages 19-22. 
Equation (1) can be rewritten to 



$$D(Q \| P) = \sum_{x \in X} Q(x) \log Q(x) - \sum_{x \in X} Q(x) \log P(x) \qquad (2)$$



The first term in equation (2) is the negative of the entropy of distribution Q(x) and is fully 
determined by the contents of the query. Therefore this first term is the same for all candidate 
images of the database and need not be considered when ranking the candidate images with 
respect to similarity to the query image. According to this first embodiment of the image retrieval 
system according to the invention, the similarity between the candidate image and the query 
image is therefore calculated with the following equation: 

$$S_K(Q,P) = \sum_{x \in X} Q(x) \log P(x) \qquad (3)$$

In which: 

S_K(Q,P) is the similarity between the candidate image and the query image, 
Q(x) is the normalized query color histogram, and 
P(x) is the normalized candidate color histogram. 

The value of S_K(Q,P) is used to rank the candidate images with respect to 
their similarity with the query image. A relatively large value indicates that two images are 
similar and a relatively low value indicates that two images are dissimilar. 
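
The ranking by equation (3) can be illustrated with a minimal Python sketch. The function name, the example histograms and the small constant eps are assumptions made for this illustration only; eps merely keeps the logarithm finite when a candidate bin is empty and is not part of the described embodiment.

```python
import math


def kullback_similarity(q, p, eps=1e-12):
    """Return S_K(Q, P) = sum over bins of Q(x) * log P(x), as in equation (3).

    q and p are normalized histograms (lists of fractions summing to 1).
    eps is an assumption of this sketch: it avoids log(0) for empty candidate bins.
    """
    if len(q) != len(p):
        raise ValueError("histograms must have the same number of bins")
    # Bins with Q(x) = 0 contribute nothing to the sum and are skipped.
    return sum(qx * math.log(px + eps) for qx, px in zip(q, p) if qx > 0)


# Hypothetical three-bin histograms, for illustration only.
query = [0.5, 0.3, 0.2]
candidates = {"close": [0.4, 0.4, 0.2], "far": [0.05, 0.05, 0.9]}

# A larger S_K means a more similar candidate, so sort in descending order.
ranked = sorted(candidates,
                key=lambda name: kullback_similarity(query, candidates[name]),
                reverse=True)
print(ranked)
```

Because the first term of equation (2) is the same for every candidate, ranking by descending S_K gives the same order as ranking by ascending divergence D(Q || P).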
In a second embodiment of the image retrieval system according to the 
invention, similarity coefficients are determined for each pair of corresponding bins of the 
two color histograms between which a similarity must be determined. Subsequently the 
obtained collection of similarity coefficients is treated as a probability distribution and the 
question of how similar the two histograms are is then answered by analyzing this 
probability distribution. In this embodiment, the similarity coefficients are calculated using 
the following equation: 



$$r_i(P,Q) = \frac{\min(p_i, q_i)}{\max(p_i, q_i)} \qquad (4)$$



In which: 

r_i(P,Q) is the similarity coefficient between bin i of the candidate color histogram and bin i 
of the query color histogram, 
p_i is the number of pixels in bin i of the candidate color histogram, and 
q_i is the number of pixels in bin i of the query color histogram. 

Especially in cases where the candidate images in the database have 
significantly different color histograms, comparison on the basis of the similarity coefficients 
as such is not sufficient. Therefore the distribution of the similarity coefficients over the bins 
is analyzed. First the distribution is normalized using the following equation: 



$$S_i = \frac{r_i}{\sum_{j=0}^{N-1} r_j} \qquad (5)$$

In which: 

S_i is an element of the normalized probability distribution S, 
r_i is calculated using equation (4), and 
N is the number of bins. 

The flatness of the distribution S is used in addition to the similarity 
coefficients themselves for determining the similarity between the candidate color histogram 
and the query color histogram. A flat distribution indicates a good overall match, while one 
with few peaks indicates a good match over only a few bins. The level of flatness of the 
probability distribution S is measured by calculating its entropy using the following equation: 



$$H(S) = -\sum_{i=0}^{N-1} S_i \log S_i \qquad (6)$$

In which: 

H(S) is the entropy of distribution S, 
S_i is an element of the distribution S, calculated using equation (5), and 
N is the number of bins. 

H(S) lies in the range [0, log(N)]. H(S) = log(N) indicates that the 
similarity coefficients of all bins are equal, i.e. r_i = r_j for all i, j in [0, N-1]. The value 
H(S) = 0 indicates that there is at most one histogram bin over which the histograms P and 
Q are similar. In this embodiment of the image retrieval system according to the invention, 
the similarity is obtained by combining the entropy H(S) and the sum of the similarity 
coefficients using the following equation: 



$$S_E(Q,P) = H(S) \times \sum_{i=0}^{N-1} r_i(P,Q) \qquad (7)$$

In which: 

S_E(Q,P) is the similarity between the candidate image and the query image, 
H(S) is the entropy according to equation (6), and 
r_i is the similarity coefficient according to equation (4). 

S_E(Q,P) lies in the range [0, N log(N)]. A larger value of S_E(Q,P) indicates 
a higher similarity between the candidate color histogram P and the query color histogram Q. 
If S_E(Q,P) = 0, P and Q are very dissimilar. If S_E(Q,P) = N log(N), P and Q are identical. 
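
The entropy measure of equations (4) to (7) can be sketched in Python as follows. This is an illustration under stated assumptions, not a definitive implementation: it assumes that the similarity coefficient of equation (4) is the ratio min(p_i, q_i) / max(p_i, q_i), which agrees with the stated range of S_E, and it assumes that a bin that is empty in both histograms counts as a perfect match. The function name and example values are hypothetical.

```python
import math


def entropy_similarity(q_counts, p_counts):
    """Return S_E(Q, P) = H(S) * sum of r_i, following equations (4) to (7).

    q_counts and p_counts hold the number of pixels per bin.
    """
    if len(q_counts) != len(p_counts):
        raise ValueError("histograms must have the same number of bins")
    coefficients = []
    for q, p in zip(q_counts, p_counts):
        largest = max(p, q)
        # Assumption: r_i = min/max; two empty bins are treated as matching (r_i = 1).
        coefficients.append(min(p, q) / largest if largest > 0 else 1.0)
    total = sum(coefficients)
    if total == 0:
        return 0.0  # no bin matches at all
    # Equation (5): normalize the coefficients into a distribution S.
    s = [r / total for r in coefficients]
    # Equation (6): entropy of S, skipping zero entries (0 * log 0 is taken as 0).
    entropy = -sum(si * math.log(si) for si in s if si > 0)
    # Equation (7): combine flatness and overall match.
    return entropy * total


identical = entropy_similarity([10, 20, 30, 40], [10, 20, 30, 40])
partial = entropy_similarity([10, 20, 30, 40], [40, 30, 20, 10])
disjoint = entropy_similarity([10, 0], [0, 10])
print(identical, partial, disjoint)
```

With four bins, identical histograms reach the upper bound 4 log(4), while histograms without any overlapping bin give 0.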

In the embodiments of the image retrieval system described above, a 
single color histogram is made from the whole image. Because of this, the spatial 
information from the image is lost and the comparison of two images reflects only global 
similarity. For example, if a user enters a query image with a sky at the top and sand at the 
bottom, the retrieved images are expected to have a mix of blue and beige, but not 
necessarily a sky and sand. A desirable result for the retrieved candidate images would be 
images with blue at the top and beige at the bottom. In order to achieve this result, a further 
embodiment of the system according to the invention determines a color histogram for a 
number of respective regions of the query image and compares these determined histograms 
with histograms of corresponding regions of the candidate image. The query image may be 
divided into regions using pre-fixed boundaries, e.g. the division of the image into a number 
of rectangles. Furthermore, the regions may be indicated manually by the user, taking into 
account important objects in the query image. In this way, the user ensures that a histogram is 
made for a region comprising the object of interest. The choice of the region size is 
important since it governs the emphasis that is given to local information. In one extreme, 
the whole image is considered as a single region so that only global information is used for 
the comparison. In the other extreme, the region size matches the individual pixels. In one of 
the further embodiments of the retrieval system according to the invention, the images are 
divided into 4x4 rectangular regions. 
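
The division into rectangular regions with pre-fixed boundaries can be sketched as follows. The sketch assumes, purely for illustration, that an image is a list of rows of pixel values; the function name and the tiny test image are hypothetical.

```python
def split_into_regions(image, m):
    """Divide an image into m x m rectangular regions with fixed boundaries.

    image is a list of rows, each row a list of pixel values.
    Returns a list of m * m pixel lists, ordered row by row (region k,l at index k*m + l).
    """
    height, width = len(image), len(image[0])
    regions = []
    for k in range(m):
        top, bottom = k * height // m, (k + 1) * height // m
        for l in range(m):
            left, right = l * width // m, (l + 1) * width // m
            # Collect every pixel that falls inside the rectangle of region k,l.
            regions.append([image[y][x]
                            for y in range(top, bottom)
                            for x in range(left, right)])
    return regions


# A 4x4 test image whose "pixels" are their own (row, column) coordinates.
image = [[(y, x) for x in range(4)] for y in range(4)]
regions = split_into_regions(image, 2)
print(len(regions), regions[0])
```

Calling the function with m = 4 gives the 4x4 division mentioned above; m = 1 gives the single global region of the earlier embodiments.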

Figure 2 schematically shows an image retrieval system according to the 
invention with multiple color histograms per image. In this system, the first histogram means 
110 determine a first query color histogram 202 of a first region of the query image 106 and 
a second query color histogram 204 of a second region of the query image 106. In the same 
way, the second histogram means 114 determine a first candidate color histogram 206 of a 
first region of the particular candidate image 118 and a second candidate color histogram 208 
of a second region of the particular candidate image 118. The example in Figure 2 shows 2 
color histograms per image, but this is mainly for the purpose of illustration since in practice 
the system will have more than 2 color histograms per image, for instance 8 or 16. In a 
subsequent step the determining means 120 of the system make multiple pair-wise 
comparisons of the respective color histograms and determine a similarity for each 
comparison. These individual similarities are combined into one overall similarity indicating 
how similar the candidate image and the query image are, taking into account the local 
information. The determining means determine a first similarity 210 on the basis of the first 
query color histogram 202 and the first candidate color histogram 206. This first similarity 
210 indicates how similar the first region of the query image 106 is to the first region of the 
candidate image 118. The determining means further determine a second similarity 212 on 
the basis of the second query color histogram 204 and the second candidate color histogram 
208. This second similarity 212 indicates how similar the second region of the query 
image 106 is to the second region of the candidate image 118. Subsequently, the determining 
means determine an overall similarity 214 on the basis of the first similarity 210 and the 
second similarity 212. This overall similarity 214 indicates the similarity between the query 
image 106 and the candidate image as a whole, taking into account the local information 
captured through the division in regions. The overall similarity 214 is used to rank the 
candidate images stored in the database 102. 

Combining the region similarities corresponding to the respective regions 
of the query image and the candidate image into an overall similarity should avoid putting too 
much emphasis on any one of the region similarities. Therefore, the further embodiments 
of the system according to the invention use the median of the region similarities as a 
measure of the similarity for the whole image. In the further embodiment of the system using 
the Kullback informational divergence, the overall similarity between the candidate image 
and the query image, based on similarities of respective regions of the images, is calculated 
according to the following equation: 



$$\bar{S}_K(I_Q, I_P) = \underset{k,l}{\operatorname{median}} \{ S_K(Q_{kl}, P_{kl}) \} \qquad (8)$$

In which: 

I_Q is the query image, 
I_P is the particular candidate image, 
\bar{S}_K(I_Q, I_P) is the overall similarity between images I_P and I_Q, 
Q_kl is the color histogram of region k,l of the query image, 
P_kl is the color histogram of region k,l of the particular candidate image, 
S_K(Q_kl, P_kl) is the similarity between region k,l of the candidate image and region k,l of the 
query image, based on the Kullback informational divergence according to equation (3), and 
M is the number of regions into which the image is divided in the horizontal and in the 
vertical direction. 

The median function sorts the individual region similarities and selects the 
middle one to be the overall similarity. 
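
The effect of the median can be illustrated with a short Python sketch. The region similarity values are made up for illustration. For an even number of regions there is no single middle value; this sketch assumes the lower of the two middle values is taken (`statistics.median_low`), which is one possible reading of "selects the middle one".

```python
from statistics import fmean, median_low


def overall_similarity(region_similarities):
    """Combine per-region similarities into one overall similarity via the median.

    Assumption of this sketch: with an even number of regions, the lower of the
    two middle values is selected so that the result is one of the region values.
    """
    return median_low(region_similarities)


# Hypothetical similarities of four regions; the last one is a large outlier.
region_similarities = [-1.1, -1.0, -1.2, -9.0]

overall = overall_similarity(region_similarities)
average = fmean(region_similarities)
print(overall, average)
```

The single outlier pulls the average far below the other region values, whereas the median stays within the cluster of typical values.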

In the further embodiment of the system using the entropy measure, the 
overall similarity between the candidate image and the query image, based on similarities of 
respective regions of the images, is calculated according to the following equation: 

$$\bar{S}_E(I_Q, I_P) = \underset{k,l}{\operatorname{median}} \{ S_E(Q_{kl}, P_{kl}) \} \qquad (9)$$

In which: 

I_Q is the query image, 
I_P is the particular candidate image, 
\bar{S}_E(I_Q, I_P) is the overall similarity between images I_P and I_Q, 
Q_kl is the color histogram of region k,l of the query image, 
P_kl is the color histogram of region k,l of the particular candidate image, 
S_E(Q_kl, P_kl) is the similarity between region k,l of the candidate image and region k,l of the 
query image, based on the entropy measure according to equation (7), and 
M is the number of regions into which the image is divided in the horizontal and in the 
vertical direction. 

Figure 3 shows the process of determining a color histogram from an 
image. Color images in the system according to the invention are represented by the three 
color components of the RGB (Red, Green and Blue) color space. However, the invention 
can also be applied for images represented in another color space. A histogram is constructed 
by independently quantizing the Red component 302, the Green component 304 and the Blue 
component 306 of every pixel. This color quantization results in the representation of the 
entire color spectrum by a smaller set of discrete values referred to as quantization levels. 
This is a many-to-one mapping and the set of colors mapped to the same quantization level 
forms a quantization cell. The number of quantization levels is referred to as q_L. A histogram 
is built by uniformly quantizing the R, G and B components of every pixel, mapping the 
three quantized values 308, 310 and 312 to a composite color value 314, and incrementing 
the corresponding histogram bin. The quantized color components r, g and b are mapped to a 
1-dimensional composite space using the following equation: 

$$C_c(r,g,b) = a_r \times r + a_g \times g + a_b \times b \qquad (10)$$

In which: 

C_c is the composite color value, 
a_r is the mapping coefficient for the R component, 
a_g is the mapping coefficient for the G component, and 
a_b is the mapping coefficient for the B component. 

Each quantized component r, g and b takes a value between 0 and (q_L - 1) 
and there are (q_L)^3 possible quantized combinations. To ensure a unique mapping from the R, 
G and B components to a composite color value, the following scheme is chosen for the 
mapping coefficients: a_r = (q_L)^2, a_g = (q_L)^1, and a_b = (q_L)^0 = 1. After each pixel has been 
mapped to a composite color value and the histogram bins have been filled with the number 
of corresponding appearances, the histogram is normalized for further use. Normalization 
comes down to dividing the number of pixels in each bin by the total number of pixels in all 
bins. After normalization, a bin contains a number representing the fraction of the pixels 
belonging to that bin rather than a number representing the count of those pixels. Throughout 
this document, a reference to a color histogram is generally to be considered as a reference 
to a normalized color histogram. 
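
The histogram construction described above (uniform quantization, the composite mapping of equation (10), and normalization) can be sketched in Python. The sketch assumes 8-bit color components (values 0 to 255) and represents pixels as (R, G, B) tuples; both choices, as well as the function name and the default of four quantization levels, are assumptions made for illustration.

```python
def color_histogram(pixels, levels=4):
    """Build a normalized color histogram with levels**3 bins.

    pixels is an iterable of (R, G, B) tuples with 8-bit components (an assumption
    of this sketch); levels plays the role of q_L, the number of quantization levels.
    """
    bins = [0] * levels ** 3
    count = 0
    for red, green, blue in pixels:
        # Uniform quantization of each component to a value in 0 .. levels - 1.
        r = red * levels // 256
        g = green * levels // 256
        b = blue * levels // 256
        # Equation (10) with a_r = levels**2, a_g = levels, a_b = 1.
        bins[levels * levels * r + levels * g + b] += 1
        count += 1
    if count == 0:
        raise ValueError("cannot build a histogram from an empty image")
    # Normalization: each bin becomes the fraction of pixels that fell into it.
    return [n / count for n in bins]


# Four hypothetical pixels: two pure red, one pure blue, one pure green.
hist = color_histogram([(255, 0, 0), (255, 0, 0), (0, 0, 255), (0, 255, 0)])
print(len(hist), hist[48], hist[12], hist[3])
```

With four levels, pure red maps to bin 16 x 3 = 48, pure green to bin 4 x 3 = 12 and pure blue to bin 3, so each color combination lands in its own bin.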
In the further embodiments of the image retrieval system according to the 
invention, multiple histograms are generated from a single image. Each of the multiple 
histograms is a histogram of a specific region of the image. In the example of Figure 3, 4 
histograms, 316, 318, 320 and 322, are generated for the four indicated regions of the 
image. 

Figure 4 shows the most important components of the image retrieval 
system according to the invention. The image retrieval system 400 is implemented according 
to a known architecture and can be realized on a general purpose computer. The image 
retrieval system has a processor 402 for carrying out instructions of an application program 
loaded into working memory 404. The image retrieval system further has an interface 406 
for communication with peripheral devices. There is a bus 408 for exchange of commands 
and data between the various components of the system. The peripherals of the image 
retrieval system include a storage medium 410 containing the executable programs, the 
database with images, and various other data. The storage medium 410 can be realized as 
various separate devices, potentially of different kinds. Application of the 
invention is not restricted by the type of device, and storage devices which can be used 
include optical disc, magnetic disc, tape, chip card, solid state or some combination of these 
devices. Furthermore, some of the data may be at a remote location and the image retrieval 
system may be connected to such a location by a network via connection 411. The 
peripherals of the image retrieval system further include a display 412 on which the system 
displays, amongst others, the query image and the candidate images. Furthermore the 
peripherals preferably include a selection device 414 and a pointing device 416 with which 
the user can move a cursor on the display. Devices 414 and 416 can be integrated into one 
selecting means 418 like a computer mouse with one or more selection buttons. However, 
other devices like a track ball, graphic tablet, joystick, or touch sensitive display are also 
possible. In order to carry out the various tasks, a number of software modules are loaded 
into the working memory 404, among which are modules constituting: entry means 108, first 
histogram means 110, second histogram means 114, determining means 120 and retrieval 
means 124. Furthermore, the working memory 404 has memory space 420 for temporarily 
storing input and output data and intermediate results, like the respective histograms and the 
determined similarity. 

Figure 5 shows an overview of the method according to the invention. In 
a first step 502, a query image is obtained containing the wishes of the user. This image may 
be composed from existing images or may be sketched by the user, possibly on the basis of 
an existing image. Then in a second step 504, a query color histogram of the query image is 
determined. This query color histogram will be used in comparing the query image with 
candidate images from a database. In a third step 506, a candidate color histogram of one of 
such candidate images is obtained. Preferably this candidate color histogram has been 
prepared in advance at the moment the candidate image was stored in the database. 
Obtaining the candidate color histogram then comes down to simply retrieving the 
histogram. Alternatively, the candidate color histogram could be created at this instant, i.e. 
at the time when it is needed. When the candidate color histogram has been obtained, the 
similarity between the query image and the candidate image is determined in a determining 
step 508. If in a comparison step 510 it is ascertained that the images are similar enough, the 
particular candidate image is retrieved from the database in retrieval step 512. The particular 
candidate image may be directly presented to the user or may be temporarily stored in a file 
for later inspection. Then in step 514 it is determined whether all candidate images in the 
database have been dealt with. If this is not the case, a candidate color histogram of a next 
candidate image is obtained in step 506 and the process is repeated for this next candidate 
image. 
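
The loop of steps 506 to 514 can be sketched in Python as follows. The sketch assumes that candidate histograms were prepared in advance and are held in a dictionary keyed by image name, and that "similar enough" in step 510 means reaching a fixed threshold; the names, the threshold value and the use of the Kullback-based similarity of equation (3) are illustrative assumptions.

```python
import math


def kullback_similarity(q, p, eps=1e-12):
    """S_K(Q, P) of equation (3); eps (an assumption) avoids log(0)."""
    return sum(qx * math.log(px + eps) for qx, px in zip(q, p) if qx > 0)


def retrieve(query_hist, candidate_hists, threshold, similarity=kullback_similarity):
    """Return (name, similarity) pairs of sufficiently similar candidates, best first."""
    selected = []
    for name, candidate_hist in candidate_hists.items():  # steps 506 and 514
        score = similarity(query_hist, candidate_hist)    # step 508
        if score >= threshold:                            # step 510
            selected.append((name, score))                # step 512
    return sorted(selected, key=lambda item: item[1], reverse=True)


# Hypothetical precomputed, normalized three-bin histograms.
query_hist = [0.5, 0.3, 0.2]
database = {"close": [0.4, 0.4, 0.2], "far": [0.05, 0.05, 0.9]}

result = retrieve(query_hist, database, threshold=-1.5)
print([name for name, _ in result])
```

Passing a different function as `similarity` would let the same loop use the entropy measure or a region-based median instead.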
1. An image retrieval system comprising: 
a database with candidate images, 
entry means for entering a query image, 
first histogram means for deriving a first query color histogram from the query 
image, 
second histogram means for deriving a first candidate color histogram from a 
particular candidate image, 
determining means for determining a first similarity between the particular 
candidate image and the query image on the basis of the first candidate color histogram and 
the first query color histogram, and 
retrieval means for retrieval of the particular candidate image, 
the determining means being arranged to determine the first similarity on the basis of 
information conveyed by the first candidate color histogram in response to information 
requested by the first query color histogram. 
2. An image retrieval system according to Claim 1, wherein the determining 
means are arranged to determine the first similarity on the basis of the Kullback 
informational divergence between the first candidate color histogram and the first query color 
histogram. 

3. An image retrieval system according to Claim 2, wherein the determining 
means are arranged to determine the first similarity according to the following equation: 

$$S_K(Q,P) = \sum_{x \in X} Q(x) \log P(x)$$



4. An image retrieval system according to Claim 1, wherein the determining 
means are arranged to determine the first similarity on the basis of the entropy of the 
distribution of similarity coefficients over the bins of the first candidate color histogram and 
of the first query color histogram. 

5. An image retrieval system according to Claim 4, wherein the determining 
means are arranged to determine the first similarity according to the following equation: 

$$S_E(P,Q) = H(S) \times \sum_{i=0}^{N-1} r_i(P,Q)$$



6. An image retrieval system according to Claim 1, wherein 
the first histogram means are arranged to derive the first query color histogram 
from a first region of the query image and to derive a second query color histogram from a 
second region of the query image, 
the second histogram means are arranged to derive the first candidate color 
histogram from a first region of the candidate image and to derive a second candidate color 
histogram from a second region of the candidate image, and 
the determining means are arranged to determine a second similarity between 
the particular candidate image and the query image on the basis of the second query color 
histogram and the second candidate color histogram and to determine an overall similarity 
between the particular candidate image and the query image on the basis of the first 
similarity and the second similarity. 

7. An image retrieval system according to Claim 6, wherein the determining 
means are arranged to determine the overall similarity using median statistics for combining 
the first similarity and the second similarity. 

8. An image retrieval system according to Claim 1, wherein the entry means 
are arranged to enable the user to compose the query image. 

9. A method for determining a similarity between a candidate image and a 
query image, the method comprising the following steps: 
a first step for obtaining the query image, 
a second step for deriving a query color histogram from the query image, 
a third step for obtaining a candidate color histogram from the candidate image, 
and 
a determining step for determining the similarity between the particular 
candidate image and the query image on the basis of the candidate color histogram and the 
query color histogram, 
the determining step including determining the similarity on the basis of information 
conveyed by the candidate color histogram in response to information requested by the query 
color histogram. 

10. A method according to Claim 9, wherein the determining step includes 
determining the similarity on the basis of the Kullback informational divergence between the 
candidate color histogram and the query color histogram. 
ABSTRACT: 

In one example given, a query for identifying interrelationships between image objects 
of a set of image objects is received from an input device. Each of a plurality of 
similarity values between all image objects of the set is compared with threshold criteria 
from the query. Clusters of image object identifiers are generated based on the comparing 
and are visually displayed on a visual output device. 

10 Claims, 16 Drawing figures 

Exemplary Claim Number: 1 

Number of Drawing Sheets: 11 
KWIC 



Detailed Description Text - DETX (5) : 

Object server 106 is the repository for image objects stored in computer system 100. 
Users store and retrieve image objects from object server 106 through requests routed by 
library server 104. Object server 106 manages storage resources based on the storage 
management entities (such as volumes) that are defined through a system administration 
program. A database on object server 106 contains data about the exact location of each 
object. The database can be, for example, the IBM DB2 Universal Database or Oracle. 



Detailed Description Text - DETX (8) : 

Users of client 102 of FIG. 1 can execute conventional image queries using the visual 
properties of images to match colors, textures, and their positions without having to 
describe them in words. Content-based queries can be combined with text and keyword 
searches for useful retrieval methods in image and multimedia databases. An image query 
includes a character string that specifies the search criteria for an image search. The 
search criteria typically include (1) a feature name, which designates the feature to be 
used in the search; (2) a feature value, which corresponds to the value of the feature 
used; (3) a feature weight, which indicates the emphasis placed on the feature when 
calculating scores and returning results; and (4) a maximum number of results desired. 



Claims Text - CLTX (8) : 

8. An apparatus, comprising: a similarity value generator, said similarity value 
generator operative to generate similarity values between image features of all image 
objects of a set of image objects, wherein the image features include average color, 
histogram color, positional color and texture; and a cluster generator, said cluster 
generator operative to generate at least one subset of image object identifiers based on a 
first comparison between a threshold criteria and each one of a first plurality of 
similarity values, the first plurality of similarity values being between a first image 
object and other image objects of the set, wherein the threshold criteria is a fixed range 
of values designated by a user indicating a required degree of similarity between the 
image features of the image objects in a cluster, and, in response to determining that a 
similarity value between the first image object and a second image object meets the 
threshold criteria, a second comparison between the threshold criteria and each one of a 
second plurality of similarity values, the second plurality of similarity values being 
between the second image object and other image objects of the set; wherein said cluster 
generator is operative to logically group image object identifiers associated with the 
first, second, and third image objects into a cluster if a similarity value between the 
second image object and a third image object meets the threshold criteria. 



Claims Text - CLTX (9) : 

9. A method of identifying subsets of interrelated image objects from a set of image 
objects, comprising: comparing image features of each image object of the set with image 
features of all other image objects of the set, wherein the image features compared 
include average color, histogram color, positional color and texture; generating a 
similarity value for each comparison; comparing a threshold criteria designated by a user 
in a query with each one of a first plurality of similarity values, the first plurality 
of similarity values being between a first image object and other image objects of a set, 
wherein the threshold criteria is a fixed range of values designated by the user 
indicating a required degree of similarity between the image features of the image 
objects in a cluster; in response to determining that a similarity value between the 
first image object and a second image object meets the threshold criteria, comparing the 
threshold criteria with each one of a second plurality of similarity values, the second 
plurality of similarity values being between the second image object and other image 
objects of the set; and in response to determining that a similarity value between the 
second image object and a third image object meets the threshold criteria, logically 
grouping image object identifiers associated with the first, second, and third image 
objects into a cluster. 



Claims Text - CLTX (10) : 

10. A computer software product, comprising: similarity value generator code, said 
similarity value generator code executable to generate a similarity value for image 
features of each image object and all other image objects of a set of at least four image 
objects in response to a single request, wherein the image features include average color, 
histogram color, positional color and texture; cluster generator code, said cluster 
generator code executable to generate at least one cluster of image object identifiers, 
wherein the cluster generator code further includes: primary compare software, said 
primary compare software executable to compare a threshold criteria designated by a user 
in a query with each one of a first plurality of similarity values, the first plurality 
of similarity values being between a first image object and other image objects of the 
set, wherein the threshold criteria is a fixed range of values designated by the user 
indicating a required degree of similarity between the image features of the image 
objects in a cluster; and secondary compare software, said secondary compare software, in 
response to determining from said primary compare software, that a similarity value 
between the first image object and a second image object meets the threshold criteria, 
said secondary compare software executable to compare the threshold criteria with each one 
of a second plurality of similarity values, the second plurality of similarity values 
being between the second image object and other image objects of the set. 